Jointly Modeling Topics and Intents with Global Order Structure

نویسندگان

  • Bei Chen
  • Jun Zhu
  • Nan Yang
  • Tian Tian
  • Ming Zhou
  • Bo Zhang
چکیده

Modeling document structure is of great importance for discourse analysis and related applications. The goal of this research is to capture the document intent structure by modeling documents as a mixture of topic words and rhetorical words. While the topics are relatively unchanged through one document, the rhetorical functions of sentences usually change following certain orders in discourse. We propose GMM-LDA, a topic modeling based Bayesian unsupervised model, to analyze the document intent structure cooperated with order information. Our model is flexible that has the ability to combine the annotations and do supervised learning. Additionally, entropic regularization can be introduced to model the significant divergence between topics and intents. We perform experiments in both unsupervised and supervised settings, results show the superiority of our model over several state-of-the-art baselines.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Integrating Document Clustering and Topic Modeling

Document clustering and topic modeling are two closely related tasks which can mutually benefit each other. Topic modeling can project documents into a topic space which facilitates effective document clustering. Cluster labels discovered by document clustering can be incorporated into topic models to extract local topics specific to each cluster and global topics shared by all clusters. In thi...

متن کامل

Multi-field Correlated Topic Modeling

Popular methods for probabilistic topic modeling like the Latent Dirichlet Allocation (LDA, [1]) and Correlated Topic Models (CTM, [2]) share an important property, i.e., using a common set of topics to model all the data. This property can be too restrictive for modeling complex data entries where multiple fields of heterogeneous data jointly provide rich information about each object or event...

متن کامل

Quantum Chemical Modeling of N-(2-benzoylphenyl)oxalamate: Geometry Optimization, NMR, FMO, MEP and NBO Analysis Based on DFT Calculations

In the present work, the quantum theoretical calculations of the molecular structure of the (N-(2-benzoylphenyl) oxalamate has been investigated and are evaluated using Density Functional Theory (DFT). The geometry of the title compound was optimized by B3LYP method with 6-311+G(d) basis set. The theoretical 1H and 13C NMR chemical shift (GIAO method) values of the title compound are calculated...

متن کامل

Enumerating Pseudo-Intents in a Partial Order

The enumeration of all the pseudo-intents of a formal context is usually based on a linear order on attribute sets, the lectic order. We propose an algorithm that uses the lattice structure of the set of intents and pseudo-intents to compute the Duquenne-Guigues basis. We argue that this method allows for efficient optimizations that reduce the required number of logical closures. We then show ...

متن کامل

Unsupervised Learning and Modeling of Knowledge and Intent for Spoken Dialogue Systems

Various smart devices (smartphone, smart-TV, in-car navigating system, etc.) are incorporating spoken language interfaces, as known as spoken dialogue systems (SDS), to help users finish tasks more efficiently. The key role in a successful SDS is a spoken language understanding (SLU) component; in order to capture language variation from dialogue participants, the SLU component must create a ma...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016